
feat(runtime-fallback): automatic model switching on API errors #1408

Closed
youming-ai wants to merge 1 commit into code-yeongyu:dev from youming-ai:feat/runtime-fallback-only

Conversation

youming-ai (Contributor) commented Feb 3, 2026

Summary

Implements runtime model fallback that automatically switches to backup models when the primary model encounters transient errors (rate limits, overload, credit balance issues, etc.).

Background

This is a cleaned-up version of #1237. The original PR contained both:

  1. ✅ (init-time) - Already merged to dev via commit 81db76f
  2. ✅ (runtime) - This PR

The original PR had merge conflicts due to the already-merged feature. This new PR contains only the unique runtime-fallback functionality.

Changes

Core Implementation

  • Add runtime_fallback configuration with customizable error codes, cooldown, and notifications
  • Implement runtime-fallback hook that intercepts API errors (400, 429, 503, 529)
  • Support fallback_models from agent/category configuration
  • Full TypeScript types and comprehensive tests

Integration (Fixed in review)

  • Wire runtimeFallback.event() into src/plugin/event.ts for session.error events
  • Wire runtimeFallback["chat.message"]() into src/plugin/chat-message.ts for assistant message errors
  • Register SessionCategoryRegistry on task creation in sync-task.ts and background manager

Bug Fixes (Fixed in review)

  • Fix hyphenated agent detection by sorting agent names by length before matching
  • Fix extractStatusCode to honor user-configured retry_on_errors (not hardcoded)
  • Add cleanupInterval.unref() and dispose() method to prevent memory leaks
  • Increase max_fallback_attempts schema limit from 10 to 20
  • Add HTTP 400 to default retry_on_errors for Anthropic credit balance errors
  • Replace overly broad /try.?again/i pattern with specific /credit.?balance/i and /billing/i
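
The hyphenated-agent fix listed above (sort names by length before matching) can be illustrated with a minimal sketch. This is not the PR's code: the function name `detectAgentFromSessionID`, the session ID format, and substring matching are assumptions made for illustration. Only the agent names `sisyphus` and `sisyphus-junior` come from this conversation.

```typescript
// Hypothetical list: only "sisyphus" and "sisyphus-junior" appear in this PR.
const AGENT_NAMES: readonly string[] = ["sisyphus", "sisyphus-junior"]

// Return the longest configured agent name contained in the session ID, if any.
function detectAgentFromSessionID(
  sessionID: string,
  agentNames: readonly string[] = AGENT_NAMES,
): string | undefined {
  // Copy before sorting so the caller's array is not mutated.
  const longestFirst = agentNames.slice().sort((a, b) => b.length - a.length)
  const haystack = sessionID.toLowerCase()
  for (const name of longestFirst) {
    // Longer names are tried first, so "sisyphus-junior" wins over "sisyphus".
    if (haystack.indexOf(name.toLowerCase()) !== -1) return name
  }
  return undefined
}
```

Trying longer names first means a shorter name can only match when no longer name containing it is present in the session ID.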

Documentation

  • Remove duplicate "Runtime Fallback" section from configurations.md

Configuration Example

{
  "runtime_fallback": {
    "enabled": true,
    "retry_on_errors": [400, 429, 503, 529],
    "max_fallback_attempts": 3,  // max: 20
    "cooldown_seconds": 60,
    "notify_on_fallback": true
  },
  "agents": {
    "sisyphus": {
      "model": "anthropic/claude-opus-4-5",
      "fallback_models": ["openai/gpt-5.2", "google/gemini-3-pro"]
    }
  }
}
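
As a rough sketch of how these options might drive model selection, assuming the hook tracks the set of failed models and an attempt counter per session: the interface mirrors the config keys above, while `nextFallbackModel` and its parameters are hypothetical names, not the PR's API.

```typescript
// Mirrors the runtime_fallback keys from the example above.
interface RuntimeFallbackConfig {
  enabled: boolean
  retry_on_errors: number[]
  max_fallback_attempts: number
  cooldown_seconds: number
  notify_on_fallback: boolean
}

// Hypothetical helper: pick the first fallback model that has not failed yet,
// or undefined when fallback is disabled or the attempt budget is spent.
function nextFallbackModel(
  fallbackModels: readonly string[],
  failedModels: ReadonlySet<string>,
  attemptsSoFar: number,
  config: RuntimeFallbackConfig,
): string | undefined {
  if (!config.enabled) return undefined
  if (attemptsSoFar >= config.max_fallback_attempts) return undefined
  for (const model of fallbackModels) {
    if (!failedModels.has(model)) return model
  }
  return undefined
}

const exampleConfig: RuntimeFallbackConfig = {
  enabled: true,
  retry_on_errors: [400, 429, 503, 529],
  max_fallback_attempts: 3,
  cooldown_seconds: 60,
  notify_on_fallback: true,
}
const exampleFallbacks = ["openai/gpt-5.2", "google/gemini-3-pro"]
```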

Supported Error Codes

| Code | Description | Provider |
|------|-------------|----------|
| 400 | Credit balance / billing errors | Anthropic |
| 429 | Rate limit exceeded | All |
| 503 | Service unavailable | All |
| 529 | Overloaded | Anthropic |
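
A minimal sketch of status-code classification that honors a user-configured `retry_on_errors` list rather than a hardcoded one. The error shape (`status` or `statusCode` fields) and the names `statusCodeOf` and `shouldFallback` are assumptions for illustration; the PR's actual `extractStatusCode` may inspect errors differently.

```typescript
// Defaults taken from the table above.
const DEFAULT_RETRY_ON_ERRORS: readonly number[] = [400, 429, 503, 529]

// Hypothetical extractor: assumes the error object carries a numeric
// `status` or `statusCode` property.
function statusCodeOf(error: unknown): number | undefined {
  if (typeof error !== "object" || error === null) return undefined
  const candidate = error as { status?: unknown; statusCode?: unknown }
  const raw = candidate.status !== undefined ? candidate.status : candidate.statusCode
  return typeof raw === "number" ? raw : undefined
}

// True only when the status code is in the configured retry list.
function shouldFallback(
  error: unknown,
  retryOnErrors: readonly number[] = DEFAULT_RETRY_ON_ERRORS,
): boolean {
  const status = statusCodeOf(error)
  return status !== undefined && retryOnErrors.indexOf(status) !== -1
}
```

Passing the list as a parameter keeps the decision tied to configuration: a user who removes 400 from `retry_on_errors` no longer triggers fallback on billing errors.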

Testing

  • bun run typecheck
  • bun test

Commits

  • 62fac11f - Initial implementation
  • eaf52ca4 - Address cubic AI review issues
  • 4aed41be - Fix hyphenated agent detection
  • bb4636b4 - Wire hook into plugin dispatchers and fix production issues
  • 3a4f6126 - Increase max attempts to 20, add 400 status for Anthropic errors

Supersedes #1237

github-actions bot (Contributor) commented Feb 3, 2026

All contributors have signed the CLA. Thank you! ✅
Posted by the CLA Assistant Lite bot.

chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5ba495f53a


Comment on lines 90 to 94
if (agent && pluginConfig.agents?.[agent as keyof typeof pluginConfig.agents]) {
  const agentConfig = pluginConfig.agents[agent as keyof typeof pluginConfig.agents]
  if (agentConfig?.fallback_models) {
    return normalizeFallbackModels(agentConfig.fallback_models)
  }


P2 Badge Use category fallback_models when resolving fallbacks

This resolver only looks at pluginConfig.agents (and a sessionID heuristic) to find fallback models, so any fallback_models configured under categories never take effect. If an agent inherits its model via a category (which is the common configuration path), the hook will still log “No fallback models configured” and skip fallback even though the category has them. Consider resolving the agent’s category (from agent config or event info) and falling back to pluginConfig.categories[category].fallback_models when agent-specific overrides are absent.
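
The suggested lookup order (agent override first, then category) could be sketched as follows. The config shape is a simplified assumption: in particular, a `category` field on the agent config and the `sessionCategory` argument are hypothetical stand-ins for however the plugin actually associates an agent or session with a category.

```typescript
type FallbackModels = string | string[]

// Simplified, assumed config shape for illustration only.
interface PluginConfigSketch {
  agents?: Record<string, { category?: string; fallback_models?: FallbackModels }>
  categories?: Record<string, { fallback_models?: FallbackModels }>
}

// Accept a single model or a list and always return a list.
function toModelList(models: FallbackModels | undefined): string[] {
  if (models === undefined) return []
  return typeof models === "string" ? [models] : models.slice()
}

// Agent-level fallback_models win; otherwise use the category's list.
function resolveFallbackModels(
  config: PluginConfigSketch,
  agent?: string,
  sessionCategory?: string,
): string[] {
  const agentConfig = agent ? config.agents?.[agent] : undefined
  const fromAgent = toModelList(agentConfig?.fallback_models)
  if (fromAgent.length > 0) return fromAgent

  const category = agentConfig?.category ?? sessionCategory
  if (category === undefined) return []
  return toModelList(config.categories?.[category]?.fallback_models)
}
```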


Comment on lines 112 to 116
if (!state.failedModels.has(model)) return false

const cooldownMs = cooldownSeconds * 1000
const timeSinceLastFallback = Date.now() - state.lastFallbackTime


P2 Badge Track cooldown per failed model to avoid global lockout

Cooldown is computed from a single lastFallbackTime for all failed models, so any fallback resets the cooldown window for every model in failedModels. In sessions where multiple fallbacks happen quickly, older models can stay blocked longer than the configured cooldown_seconds, and the list can be exhausted even though some models should be eligible again. Store per-model failure timestamps (e.g., a map of model → lastFailedAt) so the cooldown applies to each model independently.
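
A minimal sketch of the per-model approach, assuming the session state can hold a map of model to last-failure timestamp. The names `FallbackStateSketch`, `markModelFailed`, and `isModelInCooldown` are hypothetical; only `failedModels` and `cooldown_seconds` come from the review.

```typescript
interface FallbackStateSketch {
  // Model ID mapped to the timestamp (ms) of its most recent failure.
  failedModels: Map<string, number>
}

function markModelFailed(state: FallbackStateSketch, model: string, now: number = Date.now()): void {
  state.failedModels.set(model, now)
}

// A model is blocked only while its own failure is younger than the cooldown.
function isModelInCooldown(
  state: FallbackStateSketch,
  model: string,
  cooldownSeconds: number,
  now: number = Date.now(),
): boolean {
  const failedAt = state.failedModels.get(model)
  if (failedAt === undefined) return false
  return now - failedAt < cooldownSeconds * 1000
}
```

Because each model carries its own timestamp, a later failure of one model no longer extends the cooldown window of a model that failed earlier.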


cubic-dev-ai bot left a comment

4 issues found across 8 files

Confidence score: 3/5

  • Category-level fallback_models are ignored in src/hooks/runtime-fallback/index.ts, so agents inheriting category config may not fall back as intended, which could reduce reliability under failure.
  • Cooldown tracking in src/hooks/runtime-fallback/index.ts uses a single lastFallbackTime for all models, potentially extending cooldowns incorrectly when multiple models fail, which can skew fallback behavior.
  • Overall risk is moderate due to multiple runtime-fallback logic issues that could affect retry/fallback decisions in production flows.
  • Pay close attention to src/hooks/runtime-fallback/index.ts, src/hooks/runtime-fallback/constants.ts - fallback selection and error classification logic may misbehave under certain conditions.
Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/hooks/runtime-fallback/constants.ts">

<violation number="1" location="src/hooks/runtime-fallback/constants.ts:32">
P2: Unanchored numeric regexes (`/429/`, `/503/`, `/529/`) will match any occurrence of those digits inside larger numbers, causing false retry/fallback classification on unrelated error messages.</violation>
</file>
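
One way to avoid the false positives described in this violation is to require that the status code is not part of a longer run of digits. The helper below is a hypothetical sketch, not the patterns used in `constants.ts`; it relies on regex lookbehind, which is available in current Node and Bun runtimes.

```typescript
// Hypothetical helper: true when `code` appears in the message as a
// standalone number, not embedded in a longer digit sequence.
function messageMentionsStatus(message: string, code: number): boolean {
  const pattern = new RegExp("(?<!\\d)" + String(code) + "(?!\\d)")
  return pattern.test(message)
}
```

For example, `messageMentionsStatus("HTTP 429 Too Many Requests", 429)` matches, while a message containing only a request ID such as `14290` does not.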

<file name="src/hooks/runtime-fallback/index.ts">

<violation number="1" location="src/hooks/runtime-fallback/index.ts:97">
P2: Fallback agent detection is limited to a hardcoded regex list, so custom agents in config won’t be recognized from sessionID when the event lacks an agent, preventing configured fallback models from being used.</violation>

<violation number="2" location="src/hooks/runtime-fallback/index.ts:108">
P2: The fallback model resolver only checks agent-specific `fallback_models` but ignores category-level fallback configurations. Since `CategoryConfigSchema` supports `fallback_models` and agents commonly inherit settings from categories, this function should also resolve the agent's category and check `pluginConfig.categories[category].fallback_models` when agent-specific fallbacks are not defined.</violation>

<violation number="3" location="src/hooks/runtime-fallback/index.ts:115">
P2: Cooldown tracking uses a single `lastFallbackTime` timestamp for all models, which causes incorrect cooldown behavior. When multiple models fail in sequence, earlier failures have their cooldown window extended by later failures. Consider storing per-model failure timestamps (e.g., `failedModels: Map<string, number>` instead of `Set<string>`) so each model's cooldown is tracked independently.</violation>
</file>


cubic-dev-ai bot left a comment

1 issue found across 2 files (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/hooks/runtime-fallback/index.test.ts">

<violation number="1" location="src/hooks/runtime-fallback/index.test.ts:476">
P2: Test assertion permits the "No fallback models configured" path, so the new tests can pass without exercising any fallback switching logic; with the provided mock config (no fallback_models), these tests become ineffective and mask real failures.</violation>
</file>


cubic-dev-ai bot left a comment

2 issues found across 7 files (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/tools/delegate-task/executor.ts">

<violation number="1" location="src/tools/delegate-task/executor.ts:552">
P2: `executeSyncTask` registers the session category but never removes it, leaving `SessionCategoryRegistry` entries to accumulate for each sync session. Consider removing the session from the registry on completion/error (similar to `subagentSessions.delete`).</violation>
</file>
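
A sketch of the cleanup this violation asks for, using `try`/`finally` so the registry entry is removed on both success and error. The registry class and `runSyncTaskWithCategory` are hypothetical stand-ins; the real `SessionCategoryRegistry` API is not shown in this conversation.

```typescript
// Hypothetical stand-in for SessionCategoryRegistry.
class SessionCategoryRegistrySketch {
  private readonly categories = new Map<string, string>()

  register(sessionID: string, category: string): void {
    this.categories.set(sessionID, category)
  }

  get(sessionID: string): string | undefined {
    return this.categories.get(sessionID)
  }

  remove(sessionID: string): void {
    this.categories.delete(sessionID)
  }

  count(): number {
    return this.categories.size
  }
}

// Register for the lifetime of the task; always unregister afterwards.
async function runSyncTaskWithCategory<T>(
  registry: SessionCategoryRegistrySketch,
  sessionID: string,
  category: string,
  task: () => Promise<T>,
): Promise<T> {
  registry.register(sessionID, category)
  try {
    return await task()
  } finally {
    // Runs on both resolve and reject, so entries cannot accumulate.
    registry.remove(sessionID)
  }
}
```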

<file name="src/agents/utils.ts">

<violation number="1" location="src/agents/utils.ts:307">
P3: Duplicated fallback_models normalization logic is repeated in four places. This should be centralized (e.g., a helper) to avoid inconsistent changes later.</violation>
</file>


youming-ai force-pushed the feat/runtime-fallback-only branch from 2787659 to 78bb36a on February 3, 2026 06:30
youming-ai mentioned this pull request Feb 3, 2026
youming-ai force-pushed the feat/runtime-fallback-only branch from a0c1a63 to adae0f7 on February 4, 2026 06:27
code-yeongyu self-assigned this Feb 4, 2026
youming-ai force-pushed the feat/runtime-fallback-only branch from adae0f7 to cea5f43 on February 4, 2026 10:49
youming-ai force-pushed the feat/runtime-fallback-only branch from 2090fe3 to c375961 on February 8, 2026 10:27
youming-ai added a commit to youming-ai/oh-my-opencode that referenced this pull request Feb 9, 2026
- Fix bun.lock version conflicts (3.3.1 -> 3.3.2)
- Remove Git conflict markers from docs/configurations.md
- Remove duplicate normalizeFallbackModels, import from shared module
youming-ai added a commit to youming-ai/oh-my-opencode that referenced this pull request Feb 9, 2026
- Remove duplicate Runtime Fallback documentation section
- Add TTL/cleanup mechanism to prevent session state memory leaks
  - Track session last access time
  - Periodic cleanup every 5 minutes
  - 30-minute TTL for stale sessions
- Fix fragile agent name detection regex
  - Use explicit AGENT_NAMES array
  - Properly handle hyphens in agent names
- Fix inconsistent BDD comment pattern (//#when -> //#given)
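
The TTL and cleanup bullets in the commit message above could look roughly like the sketch below. Only the 30-minute TTL, the 5-minute sweep, and the use of `unref()` and `dispose()` come from this PR; the state shape and function names are hypothetical.

```typescript
const SESSION_TTL_MS = 30 * 60 * 1000 // 30-minute TTL for stale sessions
const SWEEP_INTERVAL_MS = 5 * 60 * 1000 // periodic cleanup every 5 minutes

interface SessionStateSketch {
  lastAccessedAt: number
}

const sessionStates = new Map<string, SessionStateSketch>()

// Record an access so active sessions are never swept.
function touchSession(sessionID: string, now: number = Date.now()): void {
  const state = sessionStates.get(sessionID)
  if (state) state.lastAccessedAt = now
  else sessionStates.set(sessionID, { lastAccessedAt: now })
}

// Delete sessions idle for longer than the TTL; returns how many were removed.
function sweepStaleSessions(now: number = Date.now()): number {
  let removed = 0
  sessionStates.forEach((state, sessionID) => {
    if (now - state.lastAccessedAt > SESSION_TTL_MS) {
      sessionStates.delete(sessionID)
      removed++
    }
  })
  return removed
}

// Start the periodic sweep. unref() (where the runtime provides it) keeps the
// timer from holding the process open; dispose() stops the sweep entirely.
function startSessionSweeper(): { dispose: () => void } {
  const timer = setInterval(() => sweepStaleSessions(), SWEEP_INTERVAL_MS)
  const maybeUnref = timer as unknown as { unref?: () => void }
  if (typeof maybeUnref.unref === "function") maybeUnref.unref()
  return { dispose: () => clearInterval(timer) }
}
```

Keeping `sweepStaleSessions` separate from the timer makes the eviction rule testable with an injected clock, without waiting for real time to pass.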
youming-ai added a commit to youming-ai/oh-my-opencode that referenced this pull request Feb 9, 2026
Implements runtime model fallback that automatically switches to backup models
when the primary model encounters transient errors (rate limits, overload, etc.).

Features:
- runtime_fallback configuration with customizable error codes, cooldown, notifications
- Runtime fallback hook intercepts API errors (429, 503, 529)
- Support for fallback_models from agent/category configuration
- Session-state TTL and periodic cleanup to prevent memory leaks
- Robust agent name detection with explicit AGENT_NAMES array
- Session category registry for category-specific fallback lookup

Schema changes:
- Add RuntimeFallbackConfigSchema with enabled, retry_on_errors, max_fallback_attempts,
  cooldown_seconds, notify_on_fallback options
- Add fallback_models to AgentOverrideConfigSchema and CategoryConfigSchema
- Add runtime-fallback to HookNameSchema

Files added:
- src/hooks/runtime-fallback/index.ts - Main hook implementation
- src/hooks/runtime-fallback/types.ts - Type definitions
- src/hooks/runtime-fallback/constants.ts - Constants and defaults
- src/hooks/runtime-fallback/index.test.ts - Comprehensive tests
- src/config/schema/runtime-fallback.ts - Schema definition
- src/shared/session-category-registry.ts - Session category tracking

Files modified:
- src/hooks/index.ts - Export runtime-fallback hook
- src/plugin/hooks/create-session-hooks.ts - Register runtime-fallback hook
- src/config/schema.ts - Export runtime-fallback schema
- src/config/schema/oh-my-opencode-config.ts - Add runtime_fallback config
- src/config/schema/agent-overrides.ts - Add fallback_models to agent config
- src/config/schema/categories.ts - Add fallback_models to category config
- src/config/schema/hooks.ts - Add runtime-fallback to hook names
- src/shared/index.ts - Export session-category-registry
- docs/configurations.md - Add Runtime Fallback documentation
- docs/features.md - Add runtime-fallback to hooks list

Supersedes code-yeongyu#1237, code-yeongyu#1408
Closes code-yeongyu#1408
youming-ai force-pushed the feat/runtime-fallback-only branch from 5e38d71 to 62fac11 on February 9, 2026 15:09
cubic-dev-ai bot left a comment

1 issue found across 5 files (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/hooks/runtime-fallback/index.ts">

<violation number="1" location="src/hooks/runtime-fallback/index.ts:134">
P2: Regex word boundary causes "sisyphus" to match inside "sisyphus-junior", leading to incorrect agent detection when session IDs contain hyphenated agent names.</violation>
</file>


marlon-costa-dc pushed a commit to marlon-costa-dc/oh-my-opencode that referenced this pull request Feb 9, 2026
…ation

**Source**: code-yeongyu#1408
**Author**: (contributor)
**Branch**: origin/feat/runtime-fallback-only → dev

## Feature Overview

Adds `fallback_models` configuration field to agent overrides and category configs. When primary model fails (rate limit, overload, network error), automatically falls back to specified alternative models. Enables resilient agent execution across provider outages.

## Modular Architecture Adaptation

This merge adapts the runtime-fallback feature to upstream's modular schema architecture (commit 3d5abb9):

### Files Added (Modular Pattern)
- `src/config/schema/runtime-fallback.ts` - Extracted fallback schema
  - `FallbackModelsSchema`: union of string or string array
  - Follows upstream pattern: isolated feature schema with barrel export

### Files Modified
- `src/config/schema.ts`
  - Added: `export * from "./schema/runtime-fallback"` (barrel export)
- `src/config/schema/agent-overrides.ts`
  - Added: `FallbackModelsSchema` import
  - Added: `fallback_models: FallbackModelsSchema.optional()` to AgentOverrideConfigSchema
- `src/config/schema/categories.ts`
  - Added: `FallbackModelsSchema` import
  - Added: `fallback_models: FallbackModelsSchema.optional()` to CategoryConfigSchema
- `src/shared/normalize-fallback-models.ts` (if exists)
  - Utility function for normalizing fallback model format
- `src/tools/delegate-task/executor.ts`
  - Updated to use fallback_models from agent/category config

## Patch Hierarchy

This feature is **self-contained** and has NO dependencies on other feature PRs.

**Dependencies**: None (standalone feature)
**Required by**: None (other features don't depend on this)

## Verification

```bash
bun run typecheck  # ✅ MUST pass with 0 errors
bun run build      # ✅ MUST succeed
```

**Tested**: Each modification maintains modular architecture compliance.
marlon-costa-dc pushed a commit to marlon-costa-dc/oh-my-opencode that referenced this pull request Feb 9, 2026
…r, definition-gates)

**Source**: code-yeongyu#1188
**Author**: agno01 (contributor)
**Branch**: origin/feat/mobius-loop-hooks → dev

## Feature Overview

Adds two Mobius Loop-inspired hooks to improve agent efficiency and quality:
1. **loop-detector**: Detects infinite loops (repeated tool calls, error loops, read-edit cycles) and injects stop warnings.
2. **definition-gates**: Enforces Definition of Ready (DoR) before delegation and Definition of Done (DoD) reminders on task completion.

## Conflict Resolution & Integration

Resolved conflicts between feature branch (monolithic structure) and dev branch (modular structure):

### Modular Hook Wiring
- **src/config/schema/hooks.ts**: Added `loop-detector`, `definition-gates`, and `runtime-fallback` to `HookNameSchema`.
- **src/config/schema/oh-my-opencode-config.ts**: Added `runtime_fallback` configuration schema.
- **src/hooks/index.ts**: Merged exports for all new hooks (including `runtime-fallback` from previous merge).
- **src/plugin/hooks/create-tool-guard-hooks.ts**: Wired all 3 hooks into the tool guard layer:
  - `runtime-fallback` (guards against model errors)
  - `loop-detector` (guards against execution loops)
  - `definition-gates` (guards against undefined tasks)

### Missing Wiring Fixed
- Wired up `runtime-fallback` hook (from PR code-yeongyu#1408) which was merged previously but not registered in the hook creation system.

## Verification

```bash
bun run typecheck  # ✅ Passed (0 errors)
bun test src/hooks/loop-detector/index.test.ts      # ✅ Passed
bun test src/hooks/definition-gates/index.test.ts   # ✅ Passed
bun test src/hooks/runtime-fallback/index.test.ts   # ✅ Passed
```
@thewildofficial

we merging this? what's the blocker currently

Implements runtime model fallback that automatically switches to backup
models when the primary model encounters transient errors.

Features:
- Add runtime_fallback config section (enabled, retry_on_errors, max_fallback_attempts, cooldown_seconds, notify_on_fallback)
- Add fallback_models field to agent overrides and categories
- Implement runtime-fallback hook intercepting session.error and assistant message errors
- Session-state TTL (30-min inactivity) with 5-min sweeps
- Per-model cooldown prevents rapid cycling

Integration:
- Wire hook into plugin event and chat.message dispatchers
- Register SessionCategoryRegistry on task start/resume for category lookup
- Clean up registry entries on task completion

Error detection:
- HTTP status codes: 400 (Anthropic credit), 429, 503, 529
- Error patterns: rate limit, quota exceeded, overloaded, credit balance, billing

Co-authored-by: cubic AI <cubic@code-yeongyu.com>
youming-ai force-pushed the feat/runtime-fallback-only branch from 3a4f612 to 4396ac5 on February 12, 2026 07:19
@youming-ai (Contributor, Author)

@code-yeongyu recheck

@shrimpwtf

would love to see this merged, really frustrating without the fallbacks. the code clearly defines a fallback structure but only tries the models in config

unphased commented Feb 16, 2026

my omo went off and found oh-my-opencode-slim specifically because i asked it to explore what options i might have to give me the ability to dynamically switch models. it's such a pain point with all of the stuff going around like accounts getting limited and/or banned and weird unpredictable behavior from different vendors being able to throttle your accounts to hell and then suddenly being fine later. being able to dynamically fallback and naturally recover is critical, because closing OC to reopen it and loading a session after tweaking config manually is such a nonviable workflow it's not even funny.

And it turns out oh-my-opencode-slim lets you configure fallback models but as far as I can tell also is incapable of dynamically falling back during a session, which makes it entirely useless. I'm keeping it though to use for light use cases because i like short prompts and saving tokens.

@prolific

Any chance this PR can be prioritized? Eagerly waiting for this.

@code-yeongyu (Owner)

Closing: superseded by #1777, which includes this fix plus 9 additional bug fixes. Thanks!

code-yeongyu pushed a commit that referenced this pull request Feb 20, 2026
- Fix bun.lock version conflicts (3.3.1 -> 3.3.2)
- Remove Git conflict markers from docs/configurations.md
- Remove duplicate normalizeFallbackModels, import from shared module
code-yeongyu pushed a commit that referenced this pull request Feb 20, 2026
Implements runtime model fallback that automatically switches to backup models
when the primary model encounters transient errors (rate limits, overload, etc.).

Features:
- runtime_fallback configuration with customizable error codes, cooldown, notifications
- Runtime fallback hook intercepts API errors (429, 503, 529)
- Support for fallback_models from agent/category configuration
- Session-state TTL and periodic cleanup to prevent memory leaks
- Robust agent name detection with explicit AGENT_NAMES array
- Session category registry for category-specific fallback lookup

Schema changes:
- Add RuntimeFallbackConfigSchema with enabled, retry_on_errors, max_fallback_attempts,
  cooldown_seconds, notify_on_fallback options
- Add fallback_models to AgentOverrideConfigSchema and CategoryConfigSchema
- Add runtime-fallback to HookNameSchema

Files added:
- src/hooks/runtime-fallback/index.ts - Main hook implementation
- src/hooks/runtime-fallback/types.ts - Type definitions
- src/hooks/runtime-fallback/constants.ts - Constants and defaults
- src/hooks/runtime-fallback/index.test.ts - Comprehensive tests
- src/config/schema/runtime-fallback.ts - Schema definition
- src/shared/session-category-registry.ts - Session category tracking

Files modified:
- src/hooks/index.ts - Export runtime-fallback hook
- src/plugin/hooks/create-session-hooks.ts - Register runtime-fallback hook
- src/config/schema.ts - Export runtime-fallback schema
- src/config/schema/oh-my-opencode-config.ts - Add runtime_fallback config
- src/config/schema/agent-overrides.ts - Add fallback_models to agent config
- src/config/schema/categories.ts - Add fallback_models to category config
- src/config/schema/hooks.ts - Add runtime-fallback to hook names
- src/shared/index.ts - Export session-category-registry
- docs/configurations.md - Add Runtime Fallback documentation
- docs/features.md - Add runtime-fallback to hooks list

Supersedes #1237, #1408
Closes #1408